Greg Detre
Tuesday, October 01, 2002
"WordNet is an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, and adjectives are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets."
"The price of imposing this syntactic categorization on WordNet is a certain amount of redundancy that conventional dictionaries avoid - words like back, for example, turn up in more than one category. But the advantage is that fundamental differences in the semantic organization of these syntactic categories can be clearly seen and systematically exploited. As will become clear from the papers following this one, nouns are organized in lexical memory as topical hierarchies, verbs are organized by a variety of entailment relations, and adjectives and adverbs are organized as N-dimensional hyperspaces. Each of these lexical structures reflects a different way of categorizing experience; attempts to impose a single organizing principle on all syntactic categories would badly misrepresent the psychological complexity of lexical knowledge."
"Lexical semantics begins with a recognition that a word is a conventional association between a lexicalized concept and an utterance that plays a syntactic role."
"Since the word 'word' is commonly used to refer both to the utterance and to its associated concept, discussions of this lexical association are vulnerable to terminological confusion. In order to reduce ambiguity, therefore, 'word form' will be used here to refer to the physical utterance or inscription and 'word meaning' to refer to the lexicalized concept that a form can be used to express. Then the starting point for lexical semantics can be said to be the mapping between forms and meanings (Miller, 1986). A conservative initial assumption is that different syntactic categories of words may have different kinds of mappings."
constructive
vs differential definitions (in place of meaning)
"These synonym sets (synsets) do not explain what the concepts are; they merely signify that the concepts exist. People who know English are assumed to have already acquired the concepts, and are expected to recognize them from the words listed in the synset."
"A lexical matrix, therefore, can be represented for theoretical purposes by a mapping between written words and synsets. Since English is rich in synonyms, synsets are often sufficient for differential purposes. Sometimes, however, an appropriate synonym is not available, in which case the polysemy can be resolved by a short gloss."
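The lexical matrix idea - word forms mapped many-to-many onto synsets - can be sketched as a toy data structure. The words, senses, and helper names below are illustrative inventions, not WordNet's actual storage format:

```python
# Each synset stands in for one lexicalized concept: a set of the word
# forms that can express it. "board" appears in two synsets (polysemy);
# "board" and "plank" share a synset (synonymy).
synsets = [
    frozenset({"board", "plank"}),      # a piece of sawn timber
    frozenset({"board", "committee"}),  # a group of people with a duty
]

def senses(word_form):
    """Polysemy: every synset (concept) this form can express."""
    return [s for s in synsets if word_form in s]

def synonyms(word_form):
    """Synonymy: other forms sharing at least one synset with this one."""
    return {f for s in senses(word_form) for f in s} - {word_form}

print(len(senses("board")))        # two senses
print(sorted(synonyms("board")))   # forms reachable through its synsets
```

Reading the matrix by row gives a synset (the differential "definition" of a concept); reading by column gives the senses of a form.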
synonymy:
"According to one definition (usually attributed to Leibniz) two expressions are synonymous if the substitution of one for the other never changes the truth value of a sentence in which the substitution is made. By that definition, true synonyms are rare, if they exist at all. A weakened version of this definition would make synonymy relative to a context: two expressions are synonymous in a linguistic context C if the substitution of one for the other in C does not alter the truth value."
"That is to say, if concepts are represented by synsets, and if synonyms must be interchangeable, then words in different syntactic categories cannot be synonyms (cannot form synsets) because they are not interchangeable. Nouns express nominal concepts, verbs express verbal concepts, and modifiers provide ways to qualify those concepts. In other words, the use of synsets to represent word meanings is consistent with psycholinguistic evidence that nouns, verbs, and modifiers are organized independently in semantic memory."
they take synonymy (even when defined in terms of substitutability in truth-conditions) to be a continuum, and symmetrical
antonymy
"Antonymy is a lexical relation between word forms, not a semantic relation between word meanings. For example, the meanings {rise, ascend} and {fall, descend} may be conceptual opposites, but they are not antonyms; [rise/fall] are antonyms and so are [ascend/descend], but most people hesitate and look thoughtful when asked if rise and descend, or ascend and fall, are antonyms. Such facts make apparent the need to distinguish between semantic relations between word forms and semantic relations between word meanings. Antonymy provides a central organizing principle for the adjectives and adverbs in WordNet, and the complications that arise from the fact that antonymy is a semantic relation between words are better discussed in that context."
hyponymy
"hyponymy/hypernymy is a semantic relation between word meanings: e.g., {maple} is a hyponym of {tree}, and {tree} is a hyponym of {plant}."
"For example, maple inherits the features of its superordinate, tree, but is distinguished from other trees by the hardness of its wood, the shape of its leaves, the use of its sap for syrup, etc. This convention provides the central organizing principle for the nouns in WordNet."
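This inheritance scheme - store only distinguishing features at each node, collect the rest by walking up the superordinate chain - can be sketched minimally. The chain and feature sets here are invented for illustration:

```python
# hyponym -> superordinate ("ISA" links)
hypernym = {"maple": "tree", "tree": "plant"}

# Only the distinguishing features are stored at each concept.
features = {
    "plant": {"living organism", "cellulose cell walls"},
    "tree":  {"distinct trunk", "woody", "perennial"},
    "maple": {"hard wood", "lobed leaves", "sap used for syrup"},
}

def all_features(concept):
    """Inherit: union of features along the superordinate chain."""
    acc = set()
    while concept is not None:
        acc |= features.get(concept, set())
        concept = hypernym.get(concept)
    return acc

print("living organism" in all_features("maple"))  # inherited from plant
```

The point of the design is that "living organism" need never be stored on maple at all; it is recovered on demand from plant.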
"Another relation sharing these advantages - a semantic relation - is the part-whole (or HASA) relation, known to lexical semanticists as meronymy/holonymy. A concept represented by the synset {x, x', ...} is a meronym of a concept represented by the synset {y, y', ...} if native speakers of English accept sentences constructed from such frames as A y has an x (as a part) or An x is a part of y. The meronymic relation is transitive (with qualifications) and asymmetrical (Cruse, 1986), and can be used to construct a part hierarchy (with some reservations, since a meronym can have many holonyms). It will be assumed that the concept of a part of a whole can be a part of a concept of the whole, although it is recognized that the implications of this assumption deserve more discussion than they will receive here."
"These and other similar relations serve to organize the mental lexicon. They can be represented in WordNet by parenthetical groupings or by pointers (labeled arcs) from one synset to another."
WordNet incorporates inflectional morphology
Abstract:
"Distinguishing features are entered in such a way as to create a lexical inheritance system, a system in which each word inherits the distinguishing features of all its superordinates. Three types of distinguishing features are discussed: attributes (modification), parts (meronymy), and functions (predication), but only meronymy is presently implemented in the noun files. Antonymy is also found between nouns, but it is not a fundamental organizing principle for nouns. Coverage is partitioned into twenty-five topical files, each of which deals with a different primitive semantic component."
"In terms of coverage, WordNet's goals differ little from those of a good standard handheld collegiate-level dictionary."
WordNet does not aim to cover proper nouns
"a good dictionary is a remarkable store of information: ... different kinds of information packed into lexical entries: spelling, pronunciation, inflected and derivative forms, etymology, part of speech, definitions and illustrative uses of alternative senses, synonyms and antonyms, special usage notes, occasional line drawings or plates ..." etc.
the underlying logic of a dictionary:
superordinate plus distinguishers
but if someone asks how to improve a dictionary ...
"What is missing from this definition [e.g. of 'tree': felicitous - a large, woody, perennial plant with a distinct trunk]? Anyone educated to expect this kind of thing in a dictionary will not feel that anything is missing. But the definition is woefully incomplete. It does not say, for example, that trees have roots, or that they consist of cells having cellulose walls, or even that they are living organisms. Of course, if you look up the superordinate term, plant, you may find that kind of information - unless, of course, you make a mistake and choose the definition of plant that says it is a place where some product is manufactured. There is, after all, nothing in the definition of tree that specifies which sense of plant is the appropriate superordinate. That specification is omitted on the assumption that the reader is not an idiot, a Martian, or a computer. But it is instructive to note that, even though intelligent readers can supply it for themselves, important information about the superordinate term is missing from the definition."
"Second, this definition of tree contains no information about coordinate terms. The existence of other kinds of plants is a plausible conjecture, but no help is given in finding them."
"Third, a similar challenge faces a reader who is interested in knowing the different kinds of trees. ... The prototypical definition points upward, to a superordinate term, not sideways to coordinate terms or downward to hyponyms."
"Fourth, everyone knows a great deal about trees that lexicographers would not include in a definition of tree. For example, trees have bark and twigs, they grow from seeds, adult trees are much taller than human beings, they manufacture their own food by photosynthesis, they provide shade and protection from the wind, they grow wild in forests, their wood is used in construction and for fuel, and so on. Someone who was totally innocent about trees would not be able to construct an accurate concept of them if nothing more were available than the information required to define tree. A dictionary definition draws some important distinctions and serves to remind the reader of something that is presumed to be familiar already; it is not intended as a catalogue of general knowledge. There is a place for encyclopedias as well as dictionaries."
"Note that much of the missing information is structural, rather than factual. That is to say, lexicographers make an effort to cover all of the factual information about the meanings of each word, but the organization of the conventional dictionary into discrete, alphabetized entries and the economic pressure to minimize redundancy make the reassembly of this scattered information a formidable chore."
"Since words are used to define words, how can lexicography escape circularity?"
"The fundamental design that lexicographers try to impose on the semantic memory for nouns is not a circle, but a tree (in the sense of tree as a graphical representation). It is a defining property of tree graphs that they branch from a single stem without forming circular loops."
Relations:
The semantic relation that is represented above by '@' has been called the ISA relation, or the hypernymic or superordinate relation.
The inverse semantic relation '~' goes from generic to specific (from superordinate to hyponym) and so is a specialization.
Caveats:
"It should be noted, at least parenthetically, that WordNet assumes that a distinction can always be drawn between synonymy and hyponymy. In practice, of course, this distinction is not always clear"
"somewhere a line must be drawn between lexical concepts and general knowledge, and WordNet is designed on the assumption that the standard lexicographic line is probably as distinct as any could be"
"Since WordNet is supposed to be organized according to principles governing human lexical memory, the decision to organize the nouns as an inheritance system reflects a psycholinguistic judgment about the mental lexicon. What kinds of evidence provide a basis for such decisions?"
"The isolation of nouns into a separate lexical subsystem receives some support from clinical observations of patients with anomic aphasia. After a left-hemisphere stroke that affects the ability to communicate linguistically, most patients are left with a deficit in naming ability (Caramazza and Berndt, 1978). In anomic aphasia, there is a specific inability to name objects. When confronted with an apple, say, patients may be unable to utter 'apple', even though they will reject such suggestions as shoe or banana, and will recognize that apple is correct when it is provided. They have similar difficulties in naming pictured objects, or in providing a name when given its definition, or in using nouns in spontaneous speech. Nouns that occur frequently in everyday usage tend to be more accessible than are rarely used nouns, but a patient with severe anomia looks for all the world like someone whose semantic memory for nouns has become disconnected from the rest of the lexicon. However, clinical symptoms are characterized by great variability from one patient to the next, so no great weight should be assigned to such observations."
"Psycholinguistic evidence that knowledge of nouns is organized hierarchically comes from the ease with which people handle anaphoric nouns and comparative constructions. (1) Superordinate nouns can serve as anaphors referring back to their hyponyms. For example, in such constructions as He owned a rifle, but the gun had not been fired, it is immediately understood that the gun is an anaphoric noun with a rifle as its antecedent. Moreover, (2) superordinates and their hyponyms cannot be compared (Bever and Rosenbaum, 1970). For example, both A rifle is safer than a gun and A gun is safer than a rifle are immediately recognized as semantically anomalous. Such judgments demand an explanation in terms of hierarchical semantic relations."
"More to the point, however, is the question: is there psycholinguistic evidence that people's lexical memory for nouns forms an inheritance system? The first person to make this claim explicit seems to have been Quillian (1967, 1968). Experimental tests of Quillian's proposal were reported in a seminal paper by Collins and Quillian (1969), who assumed that reaction times can be used to indicate the number of hierarchical levels separating two meanings. They observed, for example, that it takes less time to respond True to 'A canary can sing' than to 'A canary can fly', and still more time is required to respond True to 'A canary has skin'. In this example, it is assumed that can sing is stored as a feature of canary, can fly as a feature of bird, and has skin as a feature of animal. If all three features had been stored directly as features of canary, they could all have been retrieved with equal speed. The reaction times are not equal because additional time is required to retrieve can fly and has skin from the superordinate concepts. Collins and Quillian concluded from such observations that generic information is not stored redundantly, but is retrieved when needed. (In WordNet, the hierarchy is: canary @-> finch @-> passerine @-> bird @-> vertebrate @-> animal, but these intervening levels do not affect the general argument that Collins and Quillian were making.)"
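The Collins & Quillian prediction - verification time grows with the number of ISA links between the noun and the level where the predicate is stored - can be sketched as a traversal count. The chain follows the WordNet example above; the feature placements and function names are illustrative, not the original experiment:

```python
# hyponym -> superordinate, following the canary chain quoted above
hypernym = {"canary": "finch", "finch": "passerine", "passerine": "bird",
            "bird": "vertebrate", "vertebrate": "animal"}

# Each predicate is stored once, at the most general level it holds.
stored_at = {"can sing": "canary", "can fly": "bird", "has skin": "animal"}

def levels_to_verify(noun, predicate):
    """Count ISA links traversed before the predicate is found."""
    level, node = 0, noun
    while node is not None:
        if stored_at.get(predicate) == node:
            return level
        node = hypernym.get(node)
        level += 1
    return None  # predicate not stored anywhere on this chain

print(levels_to_verify("canary", "can sing"))  # 0: stored on canary itself
print(levels_to_verify("canary", "can fly"))   # 3: via finch, passerine, bird
print(levels_to_verify("canary", "has skin"))  # 5: all the way up to animal
```

On the inheritance account, reaction time should rise with this count; the robin/ostrich and has-ears results quoted below are problems precisely because the counts there are equal but the times are not.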
"Most psycholinguists agree that English common nouns are organized hierarchically in semantic memory, but whether generic information is inherited or is stored redundantly is still moot (Smith, 1978). The publication of Collins and Quillian's (1969) experiments stimulated considerable research, in the course of which a number of problems were raised. For example, according to Quillian's theory, robin and ostrich share the same kind of semantic link to the superordinate bird, yet 'A robin is a bird' is confirmed more rapidly than is 'An ostrich is a bird' (Wilkins, 1971). Or, again, can move and has ears are both properties that people associate with animal, yet 'An animal can move' is confirmed more rapidly than is 'An animal has ears' (Conrad, 1972). From these and similar results, many psycholinguists concluded that Quillian was wrong, that semantic memory for nouns is not organized as an inheritance system."
"An alternative conclusion - the conclusion on which WordNet is based - is that the inheritance assumption is correct, but that reaction times do not measure what Collins and Quillian, and other experimentalists, assumed they did. Perhaps reaction times indicate a pragmatic rather than a semantic distance - a difference in word use, rather than a difference in word meaning (Miller and Charles, 1991)."
you can
either put some semantically-empty abstract component at the top of the
hierarchy or partition the nouns with a set of semantic primes (generic
concepts, each at the top of separate hierarchies)
{act,
action, activity}
{animal, fauna}
{artifact}
{attribute, property}
{body, corpus}
{cognition, knowledge}
{communication}
{event, happening}
{feeling, emotion}
{food}
{group, collection}
{location, place}
{motive}
{natural object}
{natural phenomenon}
{person, human being}
{plant, flora}
{possession}
{process}
{quantity, amount}
{relation}
{shape}
{state, condition}
{substance}
{time}
"The problem, of course, is to decide what these primitive semantic components should be. ... One important criterion is that, collectively, they should provide a place for every English noun."
"These hierarchies vary widely in size and are not mutually exclusive - some cross-referencing is required - but on the whole they cover distinct conceptual and lexical domains. They were selected after considering the possible adjective-noun combinations that could be expected to occur" (???) (Johnson-Laird)
the 25 categories could be partly grouped in a
top level (e.g. into living/non-living)
"Lexical inheritance systems, however, seldom go more than ten levels deep, and the deepest examples usually contain technical levels that are not part of the everyday vocabulary."
"These hierarchies of nominal concepts are said to have a level, somewhere in the middle, where most of the distinguishing features are attached."
"Above the basic level, descriptions are brief and general. Below the base level, little is added to the features that distinguish basic concepts. These observations have been made largely for the names of concrete, tangible objects, but some psycholinguists have argued that a base or primary level should be a feature of every lexical hierarchy (Hoffman and Ziessler, 1983)."
"It must be possible to associate canary appropriately with at least three different kinds of distinguishing features (Miller, in press):
(1) Attributes: small, yellow (adjectives)
(2) Parts: beak, wings (nouns)
(3) Functions: sing, fly" (verbs)
In 1993, "only the pointers to parts, which go from nouns to nouns, have been implemented."
"As more distinguishing features come to be indicated by pointers, these glosses should become even more redundant. An imaginable test of the system would then be to write a computer program that would synthesize glosses from the information provided by the pointers."
"adjectives are said to modify nouns, or nouns are said to serve as arguments for attributes: Size(canary) = small"
this relation is not symmetric
"Here it is sufficient to point out that the attributes associated with a noun are reflected in the adjectives that can normally modify it. For example, a canary can be hungry or satiated because hunger is a feature of animals and canaries are animals, but a stingy canary or a generous canary could only be interpreted metaphorically, since generosity is not a feature of animals in general, or of canaries in particular."
"Keil (1979, 1983) has argued that children learn the hierarchical structure of nominal concepts by observing what can and cannot be predicated at each level. For example, the important semantic distinction between animate and inanimate nouns derives from the fact that the adjectives dead and alive can be predicated of one class of nouns but not of the other."
"The part-whole relation between nouns is generally considered to be a semantic relation, called meronymy (from the Greek meros, part; Cruse, 1986), comparable to synonymy, antonymy, and hyponymy. The relation has an inverse: if Wm is a meronym of Wh, then Wh is said to be a holonym of Wm."
"Meronyms are distinguishing features that hyponyms can inherit. Consequently, meronymy and hyponymy become intertwined in complex ways."
"Although the connections may appear complex when dissected in this manner, they are rapidly deployed in language comprehension. For example, most people do not even notice the inferences required to establish a connection between the following sentences: It was a canary. The beak was injured."
"It has been said that distinguishing features are introduced into noun hierarchies primarily at the level of basic concepts; some claims have been made that meronymy is particularly important for defining basic terms (Tversky and Hemenway, 1984)."
"The 'part of' relation is often compared to the 'kind of' relation: both are asymmetric and (with reservations) transitive, and can relate terms hierarchically (Miller and Johnson-Laird, 1976)."
"... it sounds odd to say 'The house has a handle' or 'The handle is a part of the house.' Winston, Chaffin, and Hermann (1987) take such failures of transitivity to indicate that different part-whole relations are involved in the two cases. For example, 'The branch is a part of the tree' and 'The tree is a part of a forest' do not imply that 'The branch is a part of the forest' because the branch/tree relation is not the same as the tree/forest relation."
"Such observations raise questions about how many different 'part of' relations there are. Winston et al. (1987) differentiate six types of meronyms: component-object (branch/tree), member-collection (tree/forest), portion-mass (slice/cake), stuff-object (aluminum/airplane), feature-activity (paying/shopping), and place-area (Princeton/New Jersey). Chaffin, Hermann, and Winston (1988) add a seventh: phase-process (adolescence/growing up). Meronymy is obviously a complex semantic relation - or set of relations.
Only three of these types of meronymy are coded in WordNet:
Wm #p-> Wh indicates that Wm is a component part of Wh;
Wm #m-> Wh indicates that Wm is a member of Wh; and
Wm #s-> Wh indicates that Wm is the stuff that Wh is made from.
Of these three, the 'is a component of' relation #p is by far the most frequent."
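The three coded pointer types can be sketched as labelled arcs. The data below is a tiny illustrative fragment (reusing Winston et al.'s examples), not WordNet's real noun files:

```python
# (meronym, pointer type, holonym) triples: #p component part,
# #m member, #s stuff/substance.
meronyms = [
    ("branch",   "#p", "tree"),      # a branch is a component of a tree
    ("tree",     "#m", "forest"),    # a tree is a member of a forest
    ("aluminum", "#s", "airplane"),  # an airplane is made from aluminum
]

def parts_of(whole, kind=None):
    """Meronyms of `whole`, optionally filtered by pointer type."""
    return [m for (m, k, h) in meronyms if h == whole and kind in (None, k)]

print(parts_of("tree"))               # component parts of tree
print(parts_of("forest", kind="#m"))  # members of forest
```

Keeping the types distinct also captures the transitivity failure above: branch #p tree and tree #m forest chain through different relations, so nothing licenses "the branch is a part of the forest".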
"For commonsense purposes, the dissection of an object terminates at the point where the parts no longer serve to distinguish this object from others with which it might be confused. Knowing where to stop requires commonsense knowledge of the contrasts that need to be drawn."
"Tangled hierarchies are rare when hyponymy is the semantic relation. In meronymic hierarchies, on the other hand, it is common; point, for example, is a meronym of arrow, awl, dagger, fishhook, harpoon, icepick, knife, needle, pencil, pin, sword, tine; handle has an even greater variety of holonyms. Since the points and handles involved are so different from one holonym to the next, it is remarkable that this situation causes as little confusion as it does."
"A functional feature of a nominal concept is intended to be a description of something that instances of the concept normally do, or that is normally done with or to them."
"the uses to which a thing is normally put are a central part of a person's conception of that thing."
"There are also linguistic reasons to assume that a thing's function is a feature of its meaning. Consider the problem of defining the adjective good. A good pencil is one that writes easily, a good knife is one that cuts well, a good paint job is one that covers completely, a good light is one that illuminates brightly, and so on. ... It is unthinkable that all of these different meanings should be listed in a dictionary entry for good."
"One solution is to define (one sense of) good as 'performs well the function that its head noun is intended to perform' (Katz, 1964)."
"In terms of the present approach to lexical semantics, functional information should be included by pointers to verb concepts, just as attributes are included by pointers to adjective concepts. In many cases, however, there is no single verb that expresses the function. And in cases where there is a single verb, it can be circular. For example, if the noun hammer is defined by a pointer to the verb hammer, both concepts are left in need of definition. More appropriately, the noun hammer should point to the verb pound ..."
"however: what is the function of apple or cat?"
"Although functional pointers from nouns to verbs have not yet been implemented in WordNet, the hyponymic hierarchy itself reflects function strongly. For example, a term like weapon demands a functional definition, yet hyponyms of weapon - gun, sword, club, etc. - are specific kinds of things with familiar structures (Wierzbicka, 1984). Indeed, many tangles in the noun hierarchy result from the competing demands of structure and function. Particularly among the human artifacts there are things that have been created for a purpose; they are defined both by structure and use, and consequently earn double superordinates."
"The strongest psycholinguistic indication that two words are antonyms is that each is given on a word association test as the most common response to the other."
"Semantic opposition is not a fundamental organizing relation between nouns, but it does exist and so merits its own representation in WordNet. For example, the synsets for man and woman would contain:
{[man, woman, !], person, @ ... (a male person)}
{[woman, man, !], person, @ ... (a female person)}
where the symmetric relation of antonymy is represented by the '!' pointer, and square brackets indicate that antonymy is a lexical relation between words, rather than a semantic relation between concepts."
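The word-level (not synset-level) character of the '!' pointer can be sketched with the rise/fall example from earlier in the notes. The encoding below is a hypothetical illustration, not WordNet's file format:

```python
# Lexical relation: antonymy pairs specific word forms, symmetrically.
antonym = {"man": "woman", "woman": "man",
           "rise": "fall", "fall": "rise",
           "ascend": "descend", "descend": "ascend"}

# Semantic relation: synonymy groups forms into synsets (concepts).
synsets = [frozenset({"rise", "ascend"}), frozenset({"fall", "descend"})]

# Because antonymy links word forms rather than synsets, the pairing
# does not transfer across synonyms: rise/fall and ascend/descend are
# listed, but rise/descend is simply absent.
print(antonym["rise"])                      # fall
print(antonym.get("rise") == "descend")     # False: lexical, not conceptual
```

A purely semantic encoding would instead link the two synsets, wrongly predicting that any member of {rise, ascend} is an antonym of any member of {fall, descend}.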
"When all three kinds of semantic relations - hyponymy, meronymy, and antonymy - are included, the result is a highly interconnected network of nouns. A graphical representation of a fragment of the noun network is shown in Figure 2. There is enough structure to hold each lexical concept in its appropriate place relative to the others, yet there is enough flexibility for the network to grow and change with learning."
Abstract
I would like to pose a set of fundamental questions regarding the constraints we can place on the structure of our concepts, particularly as revealed through language. I will outline a methodology for the construction of ontological types based on the dual concerns of capturing linguistic generalizations and satisfying metaphysical considerations. I discuss what 'kinds of things' there are, as reflected in the models of semantics we adopt for our linguistic theories. I argue that the flat and relatively homogeneous typing models coming out of classic Montague Grammar are grossly inadequate to the task of modelling and describing language and its meaning. I outline aspects of a semantic theory (Generative Lexicon) employing a ranking of types. I distinguish first between natural (simple) types and functional types, and then motivate the use of complex types (dot objects) to model objects with multiple and interdependent denotations. This approach will be called the Principle of Type Ordering. I will explore what the top lattice structures are within this model, and how these constructions relate to more classic issues in syntactic mapping from meaning.
At the time of Miller's writing, derivational morphology had not been included, and it doesn't appear to have been since then either.
does WordNet incorporate derivational morphology yet???
am dubious about Collins & Quillian's results about the reaction times of questions moving up/down different levels of superordinacy (sp???) - if they'd asked if a rat has legs, people would have had little trouble, but it's the fact that we don't think of birds in general as having skin that slows the response down
what exactly is he trying to do???
what's Montague Grammar???
telic???